Advanced join strategies for large-scale distributed computation
Similar resources
Advanced Join Strategies for Large-Scale Distributed Computation
Companies providing cloud-scale data services have increasing needs to store and analyze massive data sets (e.g., search logs, click streams, and web graph data). For cost and performance reasons, processing is typically done on large clusters of thousands of commodity machines by using high level scripting languages. In the recent past, there has been significant progress in adapting well-know...
Distributed Computation of Large-scale Graph Problems
Motivated by the increasing need for fast distributed processing of large-scale graphs such as the Web graph and various social networks, we study a number of fundamental graph problems in the message-passing model, where we have k machines that jointly perform a computation on an arbitrary n-node (typically, n ≫ k) input graph. The graph is assumed to be randomly partitioned among the k ≥ 2 ma...
Evaluation of Join Strategies for Distributed Mediation
Three join algorithms are evaluated in an environment with distributed main-memory based mediators and data sources. A streamed ship-out join ships bulks of tuples to a mediator near a data source, followed by post-processing in the client. An extended streamed semi-join in addition builds a main-memory hash index in the client mediator. A ship-in algorithm materializes and joins the data in th...
Computation and Data Scheduling for Large-Scale Distributed Computing
In high-energy physics, bioinformatics, and other disciplines, we encounter applications involving numerous, loosely coupled jobs that both access and generate large data sets. So-called Data Grids seek to harness geographically distributed resources for such large-scale data-intensive problems. Yet effective scheduling in such environments is challenging, due to a need to address a variety of ...
Very Large Scale OWL Reasoning through Distributed Computation
Due to recent developments in reasoning algorithms of the various OWL profiles, the classification time for an ontology has come down drastically. For all of the popular reasoners, in order to process an ontology, an implicit assumption is that the ontology should fit in primary memory. The memory requirements for a reasoner are already quite high, and considering the ever increasing size of th...
Journal
Journal title: Proceedings of the VLDB Endowment
Year: 2014
ISSN: 2150-8097
DOI: 10.14778/2733004.2733020